BioArc: Discovering Optimal Neural Architectures for Biological Foundation Models
Fang, Yi, Xu, Haoran, Han, Jiaxin, Ding, Sirui, Wang, Yizhi, Wang, Yue, Wang, Xuan
Foundation models have revolutionized fields such as natural language processing (NLP) and computer vision (CV). While efforts have been made to transfer this success to biology, existing works directly adopt foundation model architectures from general machine learning domains without a systematic design that considers the unique physicochemical and structural properties of each biological data modality. This leads to suboptimal performance, as these repurposed architectures struggle to capture the long-range dependencies, sparse information, and complex underlying "grammars" inherent to biological data. To address this gap, we introduce BioArc, a novel framework designed to move beyond intuition-driven architecture design towards principled, automated architecture discovery for biological foundation models. Leveraging Neural Architecture Search (NAS), BioArc systematically explores a vast architecture design space, evaluating architectures across multiple biological modalities while rigorously analyzing the interplay between architecture, tokenization, and training strategies. This large-scale analysis identifies novel, high-performance architectures, allowing us to distill a set of empirical design principles to guide future model development. Furthermore, to make the best use of these discovered architectures, we propose and compare several architecture prediction methods that effectively and efficiently predict optimal architectures for new biological tasks. Overall, our work provides a foundational resource and a principled methodology to guide the creation of the next generation of task-specific and foundation models for biology.
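Since the abstract centers on searching a joint space of architecture and tokenization choices, a minimal sketch may help make the search loop concrete. Everything below (the design space, the ArchConfig fields, and the proxy_score stub) is hypothetical and stands in for BioArc's actual search space and evaluation protocol:

```python
# Hypothetical NAS-style search loop: sample architectures from a design
# space, score each on a proxy biological task, keep the best.
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class ArchConfig:
    block: str        # e.g., "attention", "conv", "ssm"
    depth: int
    width: int
    tokenizer: str    # e.g., "kmer", "bpe", "char"

DESIGN_SPACE = {
    "block": ["attention", "conv", "ssm"],
    "depth": [4, 8, 12],
    "width": [256, 512],
    "tokenizer": ["kmer", "bpe", "char"],
}

def sample_arch(rng: random.Random) -> ArchConfig:
    return ArchConfig(**{k: rng.choice(v) for k, v in DESIGN_SPACE.items()})

def proxy_score(arch: ArchConfig) -> float:
    """Placeholder metric; stands in for training/evaluating the candidate."""
    return random.Random(hash(arch)).random()

def random_search(budget: int, seed: int = 0) -> tuple[ArchConfig, float]:
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(budget):
        arch = sample_arch(rng)
        score = proxy_score(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score

if __name__ == "__main__":
    arch, score = random_search(budget=50)
    print(f"best architecture: {arch} (proxy score {score:.3f})")
```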
Reflection-Based Memory for Web Navigation Agents
Azam, Ruhana, Vempaty, Aditya, Jagmohan, Ashish
Web navigation agents have made significant progress, yet current systems operate with no memory of past experiences, leading to repeated mistakes and an inability to learn from previous interactions. We introduce Reflection-Augmented Planning (ReAP), a web navigation system that leverages both successful and failed past experiences through self-reflection. Our method improves baseline results by 11 points overall and by 29 points on previously failed tasks. These findings demonstrate that reflections can transfer across different web navigation tasks.
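As a rough illustration of the idea, a reflection memory can be as simple as storing (task, outcome, lesson) triples and retrieving the most similar ones to condition planning on a new task. The sketch below uses naive keyword-overlap retrieval; all names are hypothetical, not ReAP's actual components:

```python
# Hypothetical reflection memory: store lessons from past episodes
# (successes and failures), retrieve the most relevant for a new task,
# and prepend them to the planning prompt.
from dataclasses import dataclass, field

@dataclass
class Reflection:
    task: str
    succeeded: bool
    lesson: str

@dataclass
class ReflectionMemory:
    entries: list[Reflection] = field(default_factory=list)

    def add(self, task: str, succeeded: bool, lesson: str) -> None:
        self.entries.append(Reflection(task, succeeded, lesson))

    def retrieve(self, task: str, k: int = 3) -> list[Reflection]:
        """Rank stored reflections by word overlap with the new task."""
        words = set(task.lower().split())
        scored = sorted(
            self.entries,
            key=lambda r: len(words & set(r.task.lower().split())),
            reverse=True,
        )
        return scored[:k]

def build_planning_prompt(task: str, memory: ReflectionMemory) -> str:
    lines = [f"Task: {task}", "Relevant lessons from past episodes:"]
    for r in memory.retrieve(task):
        tag = "SUCCESS" if r.succeeded else "FAILURE"
        lines.append(f"- [{tag} on '{r.task}'] {r.lesson}")
    return "\n".join(lines)

if __name__ == "__main__":
    mem = ReflectionMemory()
    mem.add("book a flight", False, "confirm the date picker closed before submitting")
    mem.add("search for shoes", True, "use the site search bar instead of menus")
    print(build_planning_prompt("book a hotel", mem))
```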
UNLEARN: Efficient Removal of Knowledge in Large Language Models
Given the prevalence of large language models (LLMs) and the prohibitive cost of training them from scratch, dynamically forgetting specific knowledge, e.g., private or proprietary information, without retraining the model has become an important capability. This paper proposes a novel method to achieve this objective, called UNLEARN. The approach builds upon subspace methods to identify and specifically target knowledge for removal without adversely affecting other knowledge in the LLM. Results demonstrate that 96% of targeted knowledge can be forgotten while maintaining performance on other knowledge to within 2.5% of the original model, significantly outperforming the discriminatory abilities of the previous state-of-the-art. A dual method called LEARN is also proposed for targeted knowledge addition. Results show that LEARN can match the fine-tuning accuracy of Low-Rank Adaptation (LoRA) without adversely affecting similar tasks.
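A toy illustration of subspace-based forgetting, under the assumption (not confirmed by the abstract) that the target knowledge is captured by a low-rank subspace of activations: estimate that subspace by SVD, then project a weight matrix onto its orthogonal complement. This is illustrative only, not the paper's algorithm:

```python
# Toy subspace-based forgetting: find a low-rank basis for activations
# elicited by target knowledge, then make a weight matrix ignore it.
import numpy as np

def knowledge_subspace(activations: np.ndarray, rank: int) -> np.ndarray:
    """Top-`rank` left singular vectors of (d, n) activation samples."""
    u, _, _ = np.linalg.svd(activations, full_matrices=False)
    return u[:, :rank]                       # (d, rank) orthonormal basis

def forget(weight: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the component of `weight`'s input space lying in `basis`."""
    projector = np.eye(basis.shape[0]) - basis @ basis.T
    return weight @ projector                # weight now ignores that subspace

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 64, 200
    acts = rng.standard_normal((d, n))       # activations for target facts
    w = rng.standard_normal((32, d))         # some layer's weight matrix
    basis = knowledge_subspace(acts, rank=4)
    w_clean = forget(w, basis)
    # Inputs inside the forgotten subspace are now mapped (near) to zero.
    x = basis @ rng.standard_normal(4)
    print(np.linalg.norm(w_clean @ x))       # ~0
```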
On the benefits of pixel-based hierarchical policies for task generalization
Cristea-Platon, Tudor, Mazoure, Bogdan, Susskind, Josh, Talbott, Walter
Reinforcement learning practitioners often avoid hierarchical policies, especially in image-based observation spaces. Typically, the single-task performance improvement over flat-policy counterparts does not justify the additional complexity of implementing a hierarchy. However, by introducing multiple decision-making levels, hierarchical policies can compose lower-level policies to generalize more effectively between tasks, highlighting the need for multi-task evaluations. We analyze the benefits of hierarchy through simulated multi-task robotic control experiments from pixels. Our results show that hierarchical policies trained with task conditioning can (1) increase performance on training tasks, (2) improve reward and state-space generalization on similar tasks, and (3) decrease the complexity of the fine-tuning required to solve novel tasks. Thus, we believe that hierarchical policies should be considered when building reinforcement learning architectures capable of generalizing between tasks.
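To make the hierarchy concrete, here is a minimal sketch of a task-conditioned two-level policy from pixels: a high-level policy picks among K low-level skills given an image encoding and a task embedding, and the chosen skill emits the action. The linear "networks" and dimensions are placeholders, not the authors' architecture:

```python
# Hypothetical two-level policy: high level selects a skill, low level acts.
import numpy as np

rng = np.random.default_rng(0)
K, FEAT, TASK, ACT = 4, 128, 16, 6                     # skills, dims

encoder = rng.standard_normal((FEAT, 64 * 64))         # toy pixel encoder
high_level = rng.standard_normal((K, FEAT + TASK))     # skill selector
skills = rng.standard_normal((K, ACT, FEAT))           # low-level skill heads

def act(image: np.ndarray, task_embedding: np.ndarray) -> np.ndarray:
    """One hierarchical decision: select a skill, then query it for an action."""
    features = np.tanh(encoder @ image.reshape(-1))    # encode pixels
    context = np.concatenate([features, task_embedding])
    skill_id = int(np.argmax(high_level @ context))    # high-level choice
    return np.tanh(skills[skill_id] @ features)        # low-level action

if __name__ == "__main__":
    obs = rng.random((64, 64))                         # grayscale observation
    task = rng.standard_normal(TASK)                   # task-conditioning vector
    print(act(obs, task))                              # 6-dim continuous action
```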
Locating Cross-Task Sequence Continuation Circuits in Transformers
While transformer models exhibit strong capabilities on linguistic tasks, their complex architectures make them difficult to interpret. Recent work has aimed to reverse engineer transformer models into human-readable representations called circuits that implement algorithmic functions. We extend this research by analyzing and comparing circuits for similar sequence continuation tasks, which include increasing sequences of digits, number words, and months. Through the application of circuit analysis techniques, we identify key sub-circuits responsible for detecting sequence members and for predicting the next member in a sequence. Our analysis reveals that semantically related sequences rely on shared circuit subgraphs with analogous roles. Overall, documenting shared computational structures enables better prediction of model behaviors, identification of errors, and safer editing procedures. This mechanistic understanding of transformers is a critical step towards building more robust, aligned, and interpretable language models.
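A common concrete tool behind this kind of circuit analysis is activation patching: run the model on a clean and a corrupted input, copy one component's clean activation into the corrupted run, and measure how much of the output is restored. The sketch below applies the technique to a toy stand-in model, not a transformer, and is not the paper's specific pipeline:

```python
# Toy activation patching: components whose patched-in clean activation
# restores the output are candidate members of the circuit.
import numpy as np

rng = np.random.default_rng(0)
D = 8
w_in = rng.standard_normal((D, D))
w_out = rng.standard_normal(D)

def forward(x: np.ndarray, patch: tuple[int, float] | None = None) -> float:
    """Two-layer toy model; optionally overwrite hidden unit i with value v."""
    hidden = np.tanh(w_in @ x)
    if patch is not None:
        i, v = patch
        hidden = hidden.copy()
        hidden[i] = v
    return float(w_out @ hidden)

clean = rng.standard_normal(D)
corrupted = rng.standard_normal(D)
clean_hidden = np.tanh(w_in @ clean)
clean_out, corrupt_out = forward(clean), forward(corrupted)

# Patch each hidden unit in turn; recovery near 1 marks a unit that
# carries the behavior on this input pair.
for i in range(D):
    patched = forward(corrupted, patch=(i, float(clean_hidden[i])))
    recovery = (patched - corrupt_out) / (clean_out - corrupt_out)
    print(f"unit {i}: recovery {recovery:+.2f}")
```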